Building Specialized Bilingual Lexicons Using Word Sense Disambiguation

نویسندگان

  • Dhouha Bouamor
  • Nasredine Semmar
  • Pierre Zweigenbaum
چکیده

This paper presents an extension of the standard approach used for bilingual lexicon extraction from comparable corpora. We study the ambiguity problem revealed by the seed bilingual dictionary used to translate context vectors and augment the standard approach by a Word Sense Disambiguation process. Our aim is to identify the translations of words that are more likely to give the best representation of words in the target language. On two specialized French-English and RomanianEnglish comparable corpora, empirical experimental results show that the proposed method consistently outperforms the standard approach.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using Parallel Texts and Lexicons for Verbal Word Sense Disambiguation

We present a system for verbal Word Sense Disambiguation (WSD) that is able to exploit additional information from parallel texts and lexicons. It is an extension of our previous WSD method (Dušek et al., 2014), which gave promising results but used only monolingual features. In the follow-up work described here, we have explored two additional ideas: using English-Czech bilingual resources (as...

متن کامل

Building A Chinese WordNet Via Class-Based Translation Model

Semantic lexicons are indispensable to research in lexical semantics and word sense disambiguation (WSD). For the study of WSD for English text, researchers have been using different kinds of lexicographic resources, including machine readable dictionaries (MRDs), machine readable thesauri, and bilingual corpora. In recent years, WordNet has become the most widely used resource for the study of...

متن کامل

Context Vector Disambiguation for Bilingual Lexicon Extraction from Comparable Corpora

This paper presents an approach that extends the standard approach used for bilingual lexicon extraction from comparable corpora. We focus on the unresolved problem of polysemous words revealed by the bilingual dictionary and introduce a use of a Word Sense Disambiguation process that aims at improving the adequacy of context vectors. On two specialized FrenchEnglish comparable corpora, empiric...

متن کامل

Towards a Generic Approach for Bilingual Lexicon Extraction from Comparable Corpora

This paper presents an approach that extends the standard approach used for bilingual lexicon extraction from comparable corpora. We focus on the problem associated to polysemous words found in the seed bilingual lexicon when translating source context vectors. To improve the adequacy of context vectors, the use of a WordNetbased Word Sense Disambiguation process is tested. Experimental results...

متن کامل

Building A Training Corpus For Word Sense Disambiguation In English-To-Vietnamese Machine Translation

The most difficult task in machine translation is the elimination of ambiguity in human languages. A certain word in English as well as Vietnamese often has different meanings which depend on their syntactical position in the sentence and the actual context. In order to solve this ambiguation, formerly, people used to resort to many hand-coded rules. Nevertheless, manually building these rules ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013